Speech Perception and Speech Production
Author
Abstract
Perception and Production of Syllable-Initial English [r] and [l] by English and Japanese Speakers
by Elzbieta B. Slawinski, The Psychology Department, The University of Calgary
Introduction. Various studies have investigated the contribution of multiple acoustic cues to the perceptual distinction of the [r]-[l] phonetic contrast in English (e.g., Dalston, 1975; Underbakke and Polka, 1988). The results of these studies showed that both spectral and temporal properties facilitate the distinction between [r] and [l]. The onset frequency and transition of F3 in relation to F2 is the primary spectral difference that English listeners need in order to differentiate [r] from [l]. For [r], the F3 frequency at onset is low and therefore close to the F2 onset, while for [l] the F3 onset is high relative to the F2 onset. Moreover, in syllable-initial position a short F1 transition is characteristic of [l] and a long F1 transition is characteristic of [r]. Thus, both spectral and temporal acoustic cues are important for the [r]-[l] distinction in prevocalic position. Underbakke and Polka (1988) demonstrated that a trading relation exists between these cues: the perceptual effect of changing one acoustic cue can be offset by changing the other cue in the opposing direction. The trading relation between temporal and spectral cues for the [r]-[l] contrast depends on language-universal phonetic processing constraints, and may be modified in second language acquisition. As the Japanese language does not have a contrast between [r] and [l] in prevocalic position, Japanese adults find these sounds very difficult to distinguish, in both perception and production.
Japanese speakers, unlike native English speakers, do not perceive a synthesized [r]-[l] continuum categorically, and they do not distinguish the two sounds in production (Yamada and Tohkura, 1990). This study investigated how native speakers of Japanese who have lived in Canada for many years perceive and produce Canadian English.
Method. 1. Subjects. Ten female native speakers of Canadian English (age 20-35 years) and ten female Japanese speakers (age 20-53 years) who were residing in Canada served as subjects. The Japanese subjects fell into two categories: 2 females who started to acquire English in Canada at the age of 5 years, and 8 females whose first contact with English was in Japanese school at the age of 12 years.
2. Stimuli. Two synthetic series of nine stimuli each were generated using the parallel/cascade synthesizer KLSYN88a. Both series were interpolated in the same steps on the spectral dimension of F2 and F3 onset frequency from "rake" to "lake", but differed on the temporal dimension: one series ("r-cue") carried a temporal pattern typical of [r], and the second series ("l-cue") carried a temporal pattern typical of [l]. From these series, oddity discrimination tests were prepared (Underbakke and Polka, 1988). In each test, six repetitions of six stimulus pairs were presented in triads; two stimuli were the same and one was different. All pairs were three steps apart on the spectral dimension. Four types of stimulus comparison were prepared: a) one-cue spectral, "l cue" (varying along the spectral dimension with a fixed 'l' temporal pattern); b) one-cue spectral, "r cue" (varying along the spectral dimension with a fixed 'r' temporal pattern); c) two-cue facilitating (changes in the temporal dimension enhanced phonetic discrimination); d) two-cue conflicting (changes in the temporal dimension suppressed phonetic discrimination).
3. Procedure.
Subjects were tested individually on all four oddity discrimination tasks, which were presented in the form of a computer game. Stimuli were presented via loudspeakers at approximately 70 dBA. During the production test, each subject was asked to produce the words "rake" and "lake" three times. Recordings were made in an anechoic room, and speech samples were recorded on tape using a B&K 4165 microphone and a SONY DAT-75ES DAT recorder. The recorded speech samples were digitized at a 40 kHz sampling frequency with 16-bit amplitude resolution. Speech samples were down-sampled to 10 kHz, and the formant frequency trajectories were estimated by an LPC formant tracking method.
Results and discussion. The pooled discrimination functions for the English speakers, shown in Figure 1, almost replicate the findings of Underbakke and Polka (1988): performance in the four oddity discrimination tasks is ordered two-cue facilitating > one-cue 'r' = one-cue 'l' > two-cue conflicting. This ordering reflects the perceptual equivalence of spectral and temporal cues.
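The stimulus design described under Method can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the author's procedure: the endpoint onset frequencies are hypothetical placeholders (the abstract gives no values), the interpolation is assumed to be linear, and the mapping of the four comparison types onto the "r-cue" and "l-cue" series is one plausible reading of the text. The function names are invented for illustration.

```python
from itertools import permutations

N_STEPS = 9        # nine stimuli per series (from the text)
PAIR_DISTANCE = 3  # compared stimuli are three steps apart (from the text)

# Hypothetical endpoint onset frequencies in Hz; the abstract gives no values.
RAKE_ONSET = {"F2": 1000.0, "F3": 1600.0}
LAKE_ONSET = {"F2": 1200.0, "F3": 2800.0}


def continuum(start, end, n_steps=N_STEPS):
    """Interpolate each formant onset in equal steps (assumed linear)."""
    return [
        {k: start[k] + (end[k] - start[k]) * i / (n_steps - 1) for k in start}
        for i in range(n_steps)
    ]


def pairs(n_steps=N_STEPS, distance=PAIR_DISTANCE):
    """1-based step pairs that lie `distance` steps apart on the continuum."""
    return [(i, i + distance) for i in range(1, n_steps - distance + 1)]


def oddity_triads(a, b):
    """All orderings of a triad holding two identical items and one odd item."""
    return sorted(set(permutations((a, a, b))) | set(permutations((a, b, b))))


def comparison(pair, kind):
    """Assign each member of a pair to a temporal series (assumed mapping).

    Step i is the more [r]-like member, step j the more [l]-like member.
    """
    i, j = pair
    series = {
        "one-cue-l": ("l-cue", "l-cue"),
        "one-cue-r": ("r-cue", "r-cue"),
        "facilitating": ("r-cue", "l-cue"),  # temporal cue agrees with spectral cue
        "conflicting": ("l-cue", "r-cue"),   # temporal cue opposes spectral cue
    }[kind]
    return ((series[0], i), (series[1], j))
```

With nine steps and a three-step spacing, `pairs()` yields exactly six pairs, (1, 4) through (6, 9), which matches the six stimulus pairs mentioned in the text. `oddity_triads(1, 4)` yields six orderings; whether these correspond to the six repetitions in the study is not stated in the abstract.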
Similar resources
Relationship between Working Memory, Auditory Perception and Speech Intelligibility in Cochlear Implanted Children of Elementary School
Objectives: This study examined the relationship between working and short-term memory performance, and their effects on cochlear implant outcomes (speech perception and speech production) in cochlear implanted children aged 7-13 years. The study also compared the memory performance of cochlear implanted children with their normal hearing peers. Methods: Thirty-one cochlear impl...
Correlation between Auditory Spectral Resolution and Speech Perception in Children with Cochlear Implants
Background: Variability in speech performance is a major concern for children with cochlear implants (CIs). Spectral resolution is an important acoustic component in speech perception. Considerable variability and limitations of spectral resolution in children with CIs may lead to individual differences in speech performance. The aim of this study was to assess the correlation between auditory ...
Reliability of Interaural Time Difference-Based Localization Training in Elderly Individuals with Speech-in-Noise Perception Disorder
Background: Previous studies have shown that interaural-time-difference (ITD) training can improve localization ability. Surprisingly little is, however, known about localization training vis-à-vis speech perception in noise based on interaural time difference in the envelope (ITD ENV). We sought to investigate the reliability of an ITD ENV-based training program in speech-in-noise perception a...
Persian Cued Speech: The Effect on the Perception of Persian Language Phonemes and Monosyllabic Words with and without Sound in Hearing Impaired Children
Objectives: This paper studies the effect of Persian Cued Speech on the perception of Persian language phonemes and monosyllabic words with and without sound in hearing impaired children. Cued Speech is a sound-based mode of communication for hearing impaired people that comprises a limited series of hand complements and the normal pattern of speech. It has been shown that it effectively ca...
Effect of signal to noise ratio on the speech perception ability of older adults
Background: Speech perception ability depends on auditory and extra-auditory elements. The signal-to-noise ratio (SNR) is an extra-auditory element that has an effect on the ability to normally follow speech and maintain a conversation. Speech in noise perception difficulty is a common complaint of the elderly. In this study, the importance of SNR magnitude as an extra-auditory effect on speech...
Lip Reading and Speech Perception of Hearing Impaired Students at Special Schools for the Hearing Impaired in Tehran
Objective: The goal of this study was to evaluate the lip reading ability and speech perception of hearing impaired students of special schools for the hearing impaired at different speech levels. Materials & Methods: In this cross-sectional study, 44 deaf students (9-12 years old) were selected by multi-stage cluster sampling from two special schools for the deaf in Tehran. Tools...